An integrated model of traffic, geography and economy in the Internet 
Modeling Internet growth is important both for understanding the current network and for predicting and improving 
its future. To date, Internet models have typically attempted to explain a subset of the following characteristics: 
network structure, traffic flow, geography, and economy. In this paper we present a discrete, agent-based model 
that integrates all of them. We show that the model generates networks with topologies, dynamics, and (more 
speculatively) spatial distributions that are similar to those of the Internet. 



I. INTRODUCTION 

As one of the most complex human constructions, the Inter- 
net is a challenging system to model. Dynamic processes of 
different time-scales operate simultaneously — from slow pro- 
cesses, like the development of new hardware, to the transport 
of data, which occurs at the speed of light. 

These phenomena are to some extent interdependent. Traf- 
fic provides income to the service providers which is then in- 
vested in infrastructure, which can lead to changes in traffic 
patterns. This paper describes an agent-based model (ABM) 
that attempts to reproduce large-scale features of the Au- 
tonomous System (AS) level of the Internet by modeling lo- 
calized and well-understood network interactions. The ASes 
of the Internet lend themselves naturally to discrete ABM 
models (4). Each AS is an economic agent, comprised of a 
spatially discrete network. Over time, ASes create new links 
to other ASes, upgrade their carrying capacity, and compete 
for customer traffic. The agents in the model described here
behave similarly, although we have simplified as much as possible.
Specifically, the model is designed to be both simple
and general enough to simulate any spatially extended com- 
munication network built by subnetworks of economically 
driven agents. 

In previous work, Chang et al. showed that incorporating
economics and geography into the Highly-Optimized Tolerance
(HOT) (6) model increases the model's accuracy (7). A
related ABM model of the AS graph produces degree distribu- 
tions similar to empirical observations (8). Bar et al. proposed 
a similar model (2), that incorporates another aspect of the 
real Internet — that the agents are spatially extended objects. 
Our model is similar in scope to this earlier work but differs 
in the details, most importantly by adding explicit economics 
(cost) to the model. Other differences include accounting for 
population density, simplifying the treatment of traffic flow, 
and not assuming a HOT framework. The previous work in 
this area, like much research on network models, focuses almost
exclusively on degree distributions of the graphs. In this
paper, we compare our results to Internet data using several
topological measures (19), including degree distributions, as 
well as geography and traffic dynamics. 

The remainder of the paper is organized as follows. First, 
we describe and motivate the model. Then, we characterize 
the time evolution, network topology, correlation between net- 
work structure and traffic flow, packet routing statistics, and 
geographical aspects of the networks produced by the model. 
Where possible, we compare the properties of these synthetic 
networks to observed data from the Internet. 



II. AS SIMULATION MODEL 

We begin with the fundamental unit responsible for network 
growth, an agent with economic interests (15). These agents
manage traffic over a geographically extended network (which 
we refer to as a sub-network to distinguish it from the network 
of ASes) and profit from the traffic that flows through their 
network. 

We compare the agents to the ASes that comprise the In- 
ternet. This is not an exact mapping — some of the Inter- 
net Service Providers (ISPs) have many AS numbers (e.g., 
AT&T), while other ASes are shared by several organizations. 
We make the common simplifying assumption that once an 
agent is introduced, it does not merge with another agent or 
go bankrupt dH H2; HD). This is partially justified by the fact 
that the Internet, from its inception, has grown monotonically, 
and we seek to capture this dynamic in our model. Most other
models of the AS graph enforce strict growth (22) as well and
are, like ours, justified by their a posteriori ability to reproduce
measured features. 

We assume a network user population distributed over 
a two-dimensional area. Traffic is simulated by a packet- 
exchange model, where a packet's source and destination are 
generated with a probability that is a function of the popu- 
lation profile. The model is initialized with one agent com- 
prised of a network (a sub-network in our terminology) that 
spans one grid location (referred to as a pixel of the landscape).

FIG. 1 Illustration of the network growth algorithm. (a) shows the
locations of four agents on the geographic grid. Each agent's locations
are assumed to be connected by a physical network administered by the
agent, but this network is not explicit in the model. (b) is an example
graph resulting from (a). That two agents are present in the same pixel
is a necessary, but not sufficient, condition for a link to form between
the agents. (c) illustrates the area that each hypothetical agent can
afford to expand to (the shaded region).

As time progresses, the agent may extend its sub-network
to other pixels, so that the sub-networks reach a
larger fraction of the population. This creates more traf- 
fic, which generates profit, which is then reinvested into fur- 
ther network expansion. Through positive feedback, the net- 
work grows until it covers the entire population. In this 
section we describe the assumptions and most of the de- 
tails of the model; the source code is publicly available from 
www.csc.kth.se/~pholme/asim/. 

An agent i is associated with a set of locations $\Lambda_i$ (representing
sources or end-points of traffic, and peering points), a
capacity $K_i$ (limiting the rate of packets that can pass through
the agent), a packet-queue $Q_i$, and a set of neighbor agents
$\Gamma_i$. A necessary, but not sufficient, condition for two agents
to be connected is that their locations overlap in at least one
pixel. The locations exist on an $L_x \times L_y$ square grid. A pixel
of the grid is characterized by its population p(x, y) and the
set of agents with a presence there, $\mathcal{A}(x, y)$. The total number
of agents is denoted by n, and the number of links between
agents by m. These quantities, except $L_x$ and $L_y$, depend on
the simulation time. The outer loop of the model then iterates
over the following steps:

1. Network growth. The number of agents is increased. 
Existing agents expand geographically, and their capac- 
ities are adjusted. 

2. Network traffic. Packets are created, propagated toward
their targets, and delivered. This process is repeated
$N_\mathrm{traffic}$ times before the next network-growth step.

We measure simulation time $t$ as the number of times Step 1
is executed (the time unit between packet movements is
$1/N_\mathrm{traffic}$). In the remainder of this section we describe the
growth and traffic steps in greater detail.
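
The two-level loop above can be sketched as follows. This is only an illustrative skeleton, not the authors' published code: the function name `run` and the callbacks `grow` and `move_packets` are hypothetical placeholders for the growth and traffic steps described in the rest of this section.

```python
def run(steps, n_traffic, grow, move_packets):
    """Call grow() once per simulation step, then move_packets() n_traffic times.

    Returns the elapsed simulation time, where one growth step is one time
    unit and each packet movement advances time by 1 / n_traffic.
    """
    clock = 0.0
    for _ in range(steps):
        grow()                      # step 1: network growth
        for _ in range(n_traffic):  # step 2: network traffic
            move_packets()
            clock += 1.0 / n_traffic
    return clock
```

With the default $N_\mathrm{traffic} = 10^4$, each growth step would therefore be followed by ten thousand traffic updates.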



A. Network growth 

The income of an agent during a time step is proportional
to the traffic propagated by the agent during the period. This
is a simplification; in a more detailed simulation one could
let the income depend both on the amount of traffic and on the
prices for forwarding the packets set by business agreements.
Assume an agent i has a budget $B_i$ that it tries to invest so that
it can increase its traffic, and thus its profit. Since there is a
possibility of congestion in the model, agent i first tries to remove
bottlenecks by increasing its capacity $K_i$ (the number of
packets that the agent can transit during one time step). When
the capacity is sufficient, the agent spends the rest of its budget
on increasing its traffic by expanding geographically. There
are three prices associated with network growth. First, the
capacity price $C_\mathrm{capacity}$: the price of increasing $K_i$ by one unit.
For simplicity we let $C_\mathrm{capacity}$ be independent of the size of the
agent's subnetwork. Second, the wire price $C_\mathrm{wire}$: the
price per pixel between a new location and the agent's closest
existing location. Last, the cost $C_\mathrm{connect}$ to connect two agents
with locations at the same pixel.

It has been observed that the average degree (number of 
neighbors of an AS) in the AS graph is relatively constant over 
time dl It I22I) . We take this as a constraint in our model and 
let the desired average agent degree $k_0$ be a control parameter.
We also assume that each agent tries to spend all of its budget, 
but not more than that, whenever it is updated. 

The network growth step iterates over the following steps: 

1. Increase of the number of agents. As long as the network
is too dense (i.e. if $2m > k_0 n$), new agents are
added. New agents are situated in the pixel (x,y) that
has the highest available population $p(x,y)/(A(x,y)+1)$,
where $A(x,y)$ is the cardinality of $\mathcal{A}(x,y)$ and $A(x,y) \geq 1$.
The budget and capacity of the new agents are initialized
to $B_\mathrm{init}$ and $K_\mathrm{init}$ respectively.

If the network is small, $n < k_0 + 1$, it is not dense enough
for new agents to be added in step 1. Thus, we do not
apply this condition when n is less than a threshold $n_0$,
and we call the time when $n = n_0$ is reached $t_0$.

2. Capacity increase. Each agent synchronously increases
its subnetwork's capacity based upon traffic from the
last time step (but not more than the agent can afford).
Agent i invests the minimum of $(B_i,\ C_\mathrm{capacity}\,\Delta T_i)$
to increase capacity ($\Delta T_i$ is the change in traffic propagated
by i since the last update).

3. Link addition. While $2m < n k_0$ (which usually means
$k_0 - 1$ times), choose two agents randomly that are not
already connected and share a common pixel. If the
budgets of both agents are larger than $C_\mathrm{connect}$, then connect
them.

4. Spatial extension. Let the agents with remaining budget
to spend extend their networks. Iterate through all
agents i and add a location at the pixel, not in $\Lambda_i$, that
has the highest available population $p(x,y)/(A(x,y)+1)$
and is not further than $(B_i - C_\mathrm{connect})/C_\mathrm{wire}$ from a location
in $\Lambda_i$ (i.e., not further from i than i can afford). (See
Figure 1(c).) An alternative location selector might select
the point which has the lowest cost per unit of population.
Unfortunately, such an algorithm is computationally
prohibitive for modeled networks of the Internet's
scale.
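
As a rough illustration of the growth rules listed above, the following sketch implements the pixel-selection and investment arithmetic on a small grid. It is not the published simulation code. The function names, the nested-list grid layout (`pop[y][x]`, `presence[y][x]` as sets of agent ids), the use of Euclidean distance between pixels, and the clamping of the capacity investment at zero are all assumptions made for this example.

```python
import math

def available_population(pop, presence, x, y):
    """p(x, y) / (A(x, y) + 1): population per agent if one more agent joined."""
    return pop[y][x] / (len(presence[y][x]) + 1)

def place_new_agent(pop, presence):
    """Step 1: among pixels that already host an agent, pick the one with
    the highest available population."""
    candidates = [(x, y)
                  for y, row in enumerate(presence)
                  for x, agents in enumerate(row)
                  if len(agents) >= 1]
    return max(candidates, key=lambda c: available_population(pop, presence, *c))

def capacity_investment(budget, traffic_change, c_capacity):
    """Step 2: spend min(B_i, C_capacity * dT_i).
    Clamping at zero for falling traffic is an assumption of this sketch."""
    return max(0.0, min(budget, c_capacity * traffic_change))

def extend_agent(pop, presence, locations, budget, c_wire, c_connect):
    """Step 4: return the best affordable pixel outside `locations`, or None.

    A pixel is affordable if its distance to the agent's nearest existing
    location is at most (B_i - C_connect) / C_wire.
    """
    reach = (budget - c_connect) / c_wire
    best, best_score = None, -1.0
    for y, row in enumerate(pop):
        for x, _ in enumerate(row):
            if (x, y) in locations:
                continue
            dist = min(math.hypot(x - lx, y - ly) for lx, ly in locations)
            if dist <= reach:
                score = available_population(pop, presence, x, y)
                if score > best_score:
                    best, best_score = (x, y), score
    return best
```

The exhaustive scan in `extend_agent` costs one distance computation per pixel and location, which is acceptable for a 50 x 50 grid but would need a spatial index for much larger landscapes.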

FIG. 2 Illustration of traffic simulation. (a) A packet is created with
source pixel s and target pixel t with probability proportional to the
product of populations at s and t. One of the agents at the target pixel
is randomly chosen as the target agent. The propagation of the packet
is shown in the graph. Each agent i is associated with a queue $Q_i$ and
a capacity $K_i$. When a packet reaches an agent, it is appended to
$Q_i$. $K_i$ packets in the queue are relayed to neighboring agents and i's
budget is credited one unit for each. The arrows in (b) symbolize the packet's
route from source to destination agent. The packet is routed to a
neighboring agent j with probability $\exp(\lambda(d(i,t) - d(j,t)))$ (where
t is the packet's target, $d(\cdot,\cdot)$ gives the graph distance, and $\lambda$ is a
parameter).

The cost of each agent modification mentioned above is im- 
mediately deducted from the budget of the agent. 

B. Network traffic 

We model traffic with a discrete, packet-exchange 
model (12; 18). The packets are generated with specific
source and target pixels, but the routing takes place on the 
network of agents. We neglect intradomain routing between 
the agent's locations, assuming the time it takes for a packet to 
pass through an agent is independent of the specific locations 
it visits. The dynamics are defined as follows: 

1. Packet generation. We assume that most traffic originates
from direct communication between individuals
and does not depend on the distance between them. So,
for each pair of points [(x,y),(x',y')] on the grid, we
create a packet with source (x, y) and destination (x',y')
with probability $P_\mathrm{pkg}\, p(x, y)\, p(x',y')$. Then one agent,
selected at random from the agents with a location at
the source pixel, is made the source node for the packet. The
destination agent is randomly chosen from the agents at
the destination pixel. Finally, one unit of credit is added
to the sender's budget.

2. Packet propagation. Each agent i propagates the first $K_i$
packets from its queue (of length $l_i$) each time step and
receives one unit of credit for each propagated packet. A
packet can propagate only one hop (inter-AS transmission)
per time step. A packet at agent i is propagated
to a neighbor j with probability $\exp(\lambda(d(i,t) - d(j,t)))$
(where t is the recipient AS, $d(\cdot,\cdot)$ is the graph distance,
and $\lambda$ is a parameter controlling the deviation
from shortest-path routing (25) observed in Ref. (16)).

3. Packet delivery. For all agents, delete all packets that
have reached their target.

Parameter              Interpretation                                   Value
$L_x = L_y$            Number of pixels in the x (and y) direction      50
$N_\mathrm{traffic}$   Number of packets sent per simulation step       $1 \times 10^4$
$P_\mathrm{pkg}$       Constant to determine packet source and dest.    0.001
$n_0$                  Agent growth threshold                           35
$K_\mathrm{init}$      Initial capacity of an agent                     5
$C_\mathrm{wire}$      Price per pixel for new wire                     500
$B_\mathrm{init}$      Initial budget for a new agent                   $3 \times 10^5$
$\lambda$              Parameter in exponential distribution            75

TABLE I Default parameter values for simulation experiments.
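
The propagation rule in step 2 can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the adjacency-dict graph representation and the function names are hypothetical, and the weights $\exp(\lambda(d(i,t) - d(j,t)))$ are normalized over the neighbors of i so that they form a probability distribution, which the text does not spell out.

```python
import math
import random
from collections import deque

def bfs_distances(adj, target):
    """Graph distance d(., target) for every agent reachable from target."""
    dist = {target: 0}
    queue = deque([target])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def next_hop_probabilities(adj, i, target, lam):
    """Normalized weights exp(lam * (d(i,t) - d(j,t))) over neighbors j of i.

    Assumes the graph is connected, so every neighbor has a finite distance.
    """
    dist = bfs_distances(adj, target)
    gains = {j: dist[i] - dist[j] for j in adj[i]}
    top = max(gains.values())  # subtract the maximum for numerical stability
    weights = {j: math.exp(lam * (g - top)) for j, g in gains.items()}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def choose_next_hop(adj, i, target, lam, rng=random):
    """Draw the neighbor that receives the packet."""
    probs = next_hop_probabilities(adj, i, target, lam)
    neighbors = list(probs)
    return rng.choices(neighbors, weights=[probs[j] for j in neighbors])[0]
```

Under this normalization, a parameter value of zero gives a uniform random walk, while large values make the choice effectively deterministic along a shortest path.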

The assumption, in step 1, that the probability that two
agents communicate is independent of their spatial separation 
is in line with the (somewhat debated) "death of distance" in 
the Internet age (5). We also tested communication rates that 
decay with the square of the distance, as observed in conventional
trade firms (20), with qualitatively the same results.

Business agreements between ASes are an important fac- 
tor in the Border Gateway Protocol (BGP) (23) (the Inter- 
net's largest scale routing protocol). Next hops are often se- 
lected by cost, rather than path length. We do not explicitly 
include inter-AS contractual agreements, but our probabilistic
propagation method (step 2) has a similar effect on average path
length (Hi). 



III. NUMERICAL SIMULATIONS 
A. Parameter values 

Before presenting the simulation results, we describe the 
experimental design, and choice of parameters. First, we spec- 
ify a population profile p(x,y). We primarily model popula- 
tion distributions, but we also model specific geographic pop- 
ulations (e.g. U.S.A. census data). To simplify the generation 
of population distributions, we neglect spatial correlations and 
simply model the frequency of population densities. This fre- 
quency has two important features: it is skewed (pixels with 
low population densities are more frequent than highly popu- 
lated pixels) and fat-tailed (there are pixels with a population 
density many orders of magnitude larger than the average). 
One probability distribution with such features is the power-law
distribution $\mathrm{Prob}(p) \sim p^{-\chi}$. To reduce the fluctuations
between different realizations of $\{p(x,y)\}$, and to prevent unrealistically
high populations within a pixel, we sample the power-law
distribution in the bounded interval $[1, (L_x L_y)^{1/(\chi-1)}]$
with $\chi = 3$. Our results do not depend strongly on the distribution
p(x,y). We obtain qualitatively similar results with normally
distributed values of p and real population-density maps
(data not shown).
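
One way to draw such a population profile is inverse-transform sampling from a truncated power law. The sketch below is an illustration only: the function names are hypothetical, the exponent is called `chi`, and the upper cutoff $(L_x L_y)^{1/(\chi-1)}$ is an assumption of this example (it is the usual natural cutoff for that many samples), since the bound is not fully legible in the extracted text.

```python
import random

def sample_bounded_power_law(chi, lo, hi, rng):
    """Draw p in [lo, hi] with density proportional to p**(-chi), chi != 1.

    Inverts the cumulative distribution
    F(p) = (lo**(1-chi) - p**(1-chi)) / (lo**(1-chi) - hi**(1-chi)).
    """
    a = lo ** (1.0 - chi)
    b = hi ** (1.0 - chi)
    u = rng.random()
    return (a - u * (a - b)) ** (1.0 / (1.0 - chi))

def population_grid(lx, ly, chi=3.0, seed=0):
    """Uncorrelated population profile p(x, y) on an lx-by-ly grid."""
    rng = random.Random(seed)
    hi = (lx * ly) ** (1.0 / (chi - 1.0))  # assumed upper cutoff
    return [[sample_bounded_power_law(chi, 1.0, hi, rng) for _ in range(lx)]
            for _ in range(ly)]
```

For a 50 x 50 grid with an exponent of 3, this assumed cutoff evaluates to 50, so no pixel exceeds fifty times the minimum population.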

FIG. 3 Time evolution of an example run. Panel (a) shows the number of
agents and the number of inter-agent links as functions of simulation
time. In (b) the fraction of the landscape with network coverage,
and the fraction of the population reached by the network, are plotted
against time. Panel (c) shows the average travel time $\langle t_p \rangle$ for packets
and the average distance (number of inter-agent hops) in the network,
$\langle d \rangle$, as functions of the number of agents.



In multiparameter, agent-based models, such as ours, a sys- 
tematic investigation of the full parameter space is infeasible. 
Parameters are, if possible, obtained from real systems. We 
set the desired degree $k_0 = 5.52$ as observed in Ref.
Unless otherwise stated, the desired size of the network is
16,000 agents, which is the same order of magnitude as the real
AS graph. Other parameters are balanced to keep runtime low
(less than one day) while still engaging all aspects of the algorithm.
This means, for example, that between every network
update, a significant number of packets are routed through
even the smallest agents, and enough packets to cause congestion
pass through larger agents. Unless otherwise stated,
we use the parameter set given in Table I. Many of the results
we show are from a single run; we have confirmed that
the results are representative by comparing them with 20 other
runs.



B. Network Growth

We begin by studying the growth of the network over time.
In Fig. 3(a) we plot the number of agents and links as a
function of simulation time for one representative run. At
$t = t_0 \approx 4 \times 10^5$ the graph is sparser than the desired degree $k_0$. Initially, the
agents spend the budget they accumulate on new links (and
increasing capacity). Around $t \approx 1.5 \times 10^6$, the budget of the
wealthier agents is sufficient to invest in wires to new locations
(see Fig. 3(b)). This creates new traffic, which causes
positive feedback that increases traffic flow, coverage, and budgets,
but also congestion. Around $t \approx 1.9 \times 10^6$, $n(t)$
and $m(t)$ change from exponential to sub-exponential growth.
As we see below, this is also the time when a significant level
of congestion appears in the system. At about the same time,
the entire population is serviced by the network. With the current
model, the network would grow indefinitely but with decreasing
returns for the agents. Alternatively one could introduce
maintenance costs proportional to network size, in which
case the network would reach a steady state where the budgets
of the agents are balanced and no further investments can be
made. For $t > 1.9 \times 10^6$ the increase of $n(t)$ is slower than
exponential. This is explained by the increasing level of congestion
in the system. In Fig. 3(c) we plot the average time
$\langle t_p \rangle$ for a packet to travel from source to destination. $\langle t_p \rangle$ is
bounded from below by the average distance (number of links
in the shortest path, averaged over pairs of nodes) $\langle d \rangle$. The two
curves diverge, i.e. a significant level of congestion appears,
around N = 1000. The growth of $n(t)$ and $m(t)$ slows down
at the same point. We conclude that growth slowdown comes
from a congestion-driven negative feedback.

The most striking feature of network growth over time is the transition from
a small network, almost constant in size, to a rapidly increasing
system (around $t \approx 1.8 \times 10^6$). This effect is typical for
technologies emerging from the interactions of a large number
of agents: they need a critical mass of users to reach a significant
fraction of the total population. One can argue that the
Internet reached this critical mass in the early 1980s when it
started to span the globe. Another important point in the Internet's
history was the advent of the World Wide Web (WWW)
in the early 1990s, and with it commercial applications and
access to the general public. Our model does not include applications,
such as the WWW, that undeniably affect network
growth. Such effects could be included by adopting a different
traffic model, but for this paper we aim at simplicity and
generality. In the Internet the growth of the number of ASes
is slower than the exponential increase of agents predicted
by the model (bgp.potaroo.net/cidr/; read January 7, 2008).
One reason for the faster growth in the model is that we do not assume that
maintenance costs are proportional to income; if such costs
grow super-linearly, negative feedback could dampen growth.
Other external factors, such as the fact that AS numbers are allocated
and assigned by a central authority (Internet Assigned
Numbers Authority, www.iana.org), might also influence the
actual rate of growth experienced by the Internet.



C. Degree distribution 

One of the most conspicuous structural features of AS-graphs
is their skewed degree distribution (first observed in
Ref. (0)), compatible with a power-law functional form @). 
In Fig. 4(a) we compare the cumulative degree distribution of
our model with that of the Internet. We use the model network
from the example run described earlier (taking data from
the simulation when N = 16,000), and the "AS06" network of
Ref. (19) (an AS-graph constructed from www.routeviews.org
and www.ripe.net, with N = 22,688). The match between the
model and the real network is striking. Preliminary studies
indicate that the slope of the curve is largely insensitive to 
changes in parameter values. We compare this result with a 
generic network model that produces power-law degree dis- 
tributions (the Barabasi-Albert (BA) model (3)) and a sim- 
ple, geographic model of the AS-graph designed by Fabrikant, 
Koutsoupias, and Papadimitriou (FKP) (13).

FIG. 4 The degree distribution (cumulative mass function) of a real
AS-graph (AS06) together with the degree distribution of a network generated
with our model (a), the BA model (b), and the FKP model (c). Panel
(d) is a density plot that illustrates the correlation between traffic and
degree in our model runs.

The BA model is a growth model in which one node (and m
links to attach it to the rest of the network) is added every
time step. Preferential attachment is used to determine the
endpoints of the new links: the probability of attaching to a
node of degree k is proportional to k.
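
The BA growth rule can be sketched in a few lines. This is a generic textbook-style implementation rather than the code used for the comparison in this paper; the function name, the choice of a fully connected seed of m + 1 nodes, and the rejection of duplicate targets are assumptions of the example.

```python
import random

def barabasi_albert(n, m, seed=0):
    """Grow a BA-style graph with n nodes; each new node brings m links.

    Returns a list of undirected edges (new_node, old_node).
    """
    rng = random.Random(seed)
    # Seed graph: a clique on m + 1 nodes (an assumption of this sketch).
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `stubs` once per incident link, so a uniform draw
    # from the list selects a node with probability proportional to its degree.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:   # redraw until m distinct targets are found
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges
```

The stub list makes each preferential draw a constant-time operation, which is why this representation is commonly used for large networks.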

The FKP model is also a simple growth model. Each time
step, one node, and a link attached to it, is added to the
graph. A new node i is assigned random coordinates in the
unit square and attached to the old node j that minimizes
$d_0(j) + \alpha\,|r_i - r_j|$ (where $d_0(j)$ is the graph distance between
j and the node added first, $|r_i - r_j|$ is the Euclidean distance
between i and j, and $\alpha$ is a parameter setting the cost balance
between making new physical connections and using the existing
network).
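
A direct, quadratic-time sketch of this attachment rule follows. The function name and return format are hypothetical. Because each new node adds exactly one link, the graph is a tree, so the graph distance to the first node can be maintained as the parent's distance plus one.

```python
import math
import random

def fkp_tree(n, alpha, seed=0):
    """Grow an FKP-style tree on n nodes in the unit square.

    Returns parent[i], the node that i attached to (None for the first node).
    """
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random())]
    hops = [0]        # d_0(j): graph distance from j to the first node
    parent = [None]
    for i in range(1, n):
        r = (rng.random(), rng.random())
        # Attach to the old node minimizing d_0(j) + alpha * |r_i - r_j|.
        j = min(range(i), key=lambda k: hops[k] + alpha * math.dist(pos[k], r))
        pos.append(r)
        hops.append(hops[j] + 1)
        parent.append(j)
    return parent
```

In this formulation a small $\alpha$ makes every node attach to the first node (a star), while a large $\alpha$ makes nodes attach to their nearest geographic neighbor.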

In Figs. 4(b) and (c) we plot the cumulative mass function
of degree for one BA and one FKP network. The model parameter
values were chosen to give networks as close as possible
to the real AS-graph (m = 5 for the BA model, $\alpha = 4$
for the FKP model, and N = 22,688 for both). The slope of
the BA model is steeper than that of the real network, and the curve
for the FKP model is flatter than the real data. To compare
the goodness of fit, since the curves have a similar range in
$\log P(k)$, we measure the ratio $\delta$ of the area between the curves
to the area (in the $\log P(k)$-$\log k$ space) spanned by the extreme
values of $\log k$ and $\log P(k)$. We find $\delta = 0.95\%$ for our
model, 4.0% for the BA model, and 11% for the FKP model.
Although both the BA and FKP models have been extended to
yield better data fits (1; 28), the original forms of the models
illustrate two important components of Internet growth,
namely the rich-gets-richer effect driving the growth of the
BA model and the spatial trade-off effect of the FKP model.
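
The area-ratio measure is only described in words, so the sketch below is one plausible reading of it rather than the authors' procedure. The assumptions are: the cumulative distribution is taken as the fraction of nodes with degree at least k, the curves are compared only at degrees where both are positive, the area between them is integrated with the trapezoid rule in base-10 log-log coordinates, and the bounding box uses the extreme values over both curves. All function names are hypothetical.

```python
import math

def ccdf(degrees):
    """Return {k: fraction of nodes with degree >= k} for k = 1..max degree."""
    n = len(degrees)
    kmax = max(degrees)
    counts = [0] * (kmax + 2)
    for k in degrees:
        counts[k] += 1
    out, running = {}, 0
    for k in range(kmax, 0, -1):
        running += counts[k]
        out[k] = running / n
    return out

def area_ratio(deg_a, deg_b):
    """Area between two log-log cumulative curves, divided by the area of
    the box spanned by the extreme values of log k and log P."""
    pa, pb = ccdf(deg_a), ccdf(deg_b)
    ks = sorted(set(pa) & set(pb))  # degrees where both curves are positive
    xs = [math.log10(k) for k in ks]
    ya = [math.log10(pa[k]) for k in ks]
    yb = [math.log10(pb[k]) for k in ks]
    gap = [abs(a - b) for a, b in zip(ya, yb)]
    between = sum(0.5 * (gap[i] + gap[i + 1]) * (xs[i + 1] - xs[i])
                  for i in range(len(ks) - 1))
    box = (xs[-1] - xs[0]) * (max(ya + yb) - min(ya + yb))
    return between / box
```

Under these assumptions the ratio is zero for identical degree sequences and cannot exceed one, which makes values such as those quoted above directly comparable across models.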

FIG. 5 Radial statistics for real and model networks. Panels (a)-
(c) show the radial densities of nodes for the real AS-graph and our
algorithm (a), the BA (b) and FKP (c) models. Panels (d)-(f) show
the average degree vs. average distance d for our algorithm, the BA,
and the FKP model respectively. The data of panels (b), (c), (e), and
(f) are plotted in Ref. (19) as well.

A combination of these effects may explain why our
model's degree distribution, and the curve of the real network,
lie between those of the original BA and FKP models. In our
model, the degrees of nodes do not directly affect the creation 
of new links. However, preferential attachment occurs indi- 
rectly via positive feedback — nodes with large degree acquire 
more traffic, and thus more budget which they can reinvest 
in more connections, thus increasing their degree. The effect 
of preferential attachment in the model is shown in Fig. 4(d),
which is a plot of the probability density of a node's traffic 
load given its degree. Because an agent's income is correlated 
with the traffic that it propagates, and a larger budget will in- 
crease the possibility of creating new links, there is positive 
feedback between the degree and the rate of degree increase, 
i.e. a form of preferential attachment. Note that the correlation 
in Fig. 4(d) is not linear (the slope is different from the solid
line's). It is known that nonlinear preferential attachment does 
not give a power-law degree distribution (21) (which we seem 
to have), so preferential attachment is not the only factor af- 
fecting our network's growth. (If we had linear preferential 
attachment, the slope of P(k) would, furthermore, be the same 
as the BA model.) 



D. Radial structure 

Structurally, the AS graph is hierarchically ordered:
engineers and network operators speak of the first, second, and
third tier. For the model networks, we measure a node's position
in the hierarchy by its network centrality (0). In Fig. 5
we plot the average fraction of nodes and the average degree
as functions of the average distance d to other nodes in
the network (d is the inverse of a centrality measure known
as closeness centrality, so more central nodes are to the left
in the diagrams). By this method we can get a radial picture
of the AS graph structure from the center to the periphery. In
Fig. 5(a)-(c) we plot the fraction of vertices at different d-values.
We note that our model resembles the real AS-graph
more closely than the BA and FKP models, having peaks
(roughly corresponding to the tiers of the Internet) like the
observed AS-graph. The shift to the left of the model curve
in Fig. 5(a) can, to some extent, be explained by its smaller
size (larger networks have larger average distances, leading to
a curve displaced to the right). In brief, the BA model lacks
the complex periphery of the real AS-graph (the density is
more balanced, compared with the left-skewed curve of the
real-world network). The average degree as a function of d is
less right-skewed in the BA model compared with the empirical
network. Just like the degree distribution, the FKP model
deviates from the real network in the opposite way compared
to the BA model: the high-degree nodes of the FKP model
are extremely concentrated at the center of the network.

FIG. 6 Traffic patterns of the model. (a) displays the number of
extra steps $d^+$ in packet navigation in the real Internet compared to
our model. Panel (b) shows the probability density of agents having
betweenness $C_B$ and traffic density $\rho$. The data is collected from
twenty independent runs.
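
The radial statistics of Section III.D (fraction of nodes and average degree as functions of a node's average distance to all other nodes) can be computed with one breadth-first search per node. The sketch below is illustrative only: the function names, the adjacency-dict input, and the fixed bin width are assumptions, and the graph is assumed to be connected.

```python
from collections import deque

def average_distances(adj):
    """Mean shortest-path distance from each node to all other nodes
    (the graph is assumed to be connected)."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[s] = sum(dist.values()) / (len(adj) - 1)
    return result

def radial_profile(adj, bin_width=0.5):
    """Map each distance bin (by its lower edge) to a pair
    (fraction of nodes in the bin, mean degree of those nodes)."""
    dbar = average_distances(adj)
    bins = {}
    for v, d in dbar.items():
        bins.setdefault(int(d / bin_width), []).append(len(adj[v]))
    n = len(adj)
    return {b * bin_width: (len(ds) / n, sum(ds) / len(ds))
            for b, ds in sorted(bins.items())}
```

For a star with four leaves, the hub has average distance 1 and the leaves 1.75, so the profile separates a single high-degree central node from a low-degree periphery.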



E. Traffic flow and congestion patterns 

In Section III.B we investigated network topology and its
growth. In this section we study traffic flow and how network
topology affects it. In the Internet, packets do not necessarily
travel the shortest distances between source and destination.
Most importantly, business agreements between agents
arrange agents into a hierarchy (15). The business contracts
put constraints on how packets are routed. For example, a packet
usually cannot first be routed downwards (to customers)
and then upwards (to providers) in the hierarchy, even if that
is a shorter path (this is known as the valley-free rule). Gao and
Wang (16) investigated the extra distance $d^+$ packets need to
travel due to such reasons. They found a decaying probability
distribution of $d^+$, meaning that most of the traffic actually
travels via shortest paths. In our model we do not have explicit
business agreements that cause hierarchical routing into
the core of the network, and out again. It is, however, true
for most graphs that a vast majority of shortest paths pass through a
restricted core of the graph (17), and our traffic model routes
most traffic via short (if not the shortest) paths. The $d^+$ distribution
of our model (shown in Fig. 6(a)) matches the observation
of Gao and Wang (16).

We proceed to investigate the relationship between graph
centrality and traffic density. This can tell us something about
how congestion and fluctuations affect routing (18). If all
agents have sufficient capacity for packets to always route
along shortest paths, then traffic density along a link l will
be proportional to its betweenness centrality

$$C_B(l) = \sum_{i,j} \frac{\sigma_l(i,j)}{\sigma(i,j)} \qquad (1)$$

where $\sigma_l(i,j)$ is the number of shortest paths between nodes i
and j passing through the link l, and $\sigma(i,j)$ is the total number
of shortest paths between i and j. If an AS is congested, the
traffic through its links will be lower than anticipated by the
betweenness of the edge. Thus, congestion patterns can be illustrated
by studying betweenness and traffic load. Fig. 6(b)
is a density plot of the actual traffic density as a function of
betweenness of the links of the model network. For more
central nodes (higher betweenness), there is a strong correlation
between betweenness and traffic density: the vertices
with $C_B \approx 4 \times 10^5$ span half a decade of $\rho$. For the more
peripheral nodes the correlation is less clear (vertices with
$C_B \approx 5 \times 10^4$ can have $\rho$-values spanning almost three orders of
magnitude). Indeed, there seems to be a separation of agents
into two classes, one with capacity to keep the traffic flowing,
another with too low capacity. For links of low betweenness
the traffic-betweenness correlation is weak. To summarize,
congestion does affect the system, and it is most pronounced
for nodes carrying little, or intermediate, traffic levels.
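
For reference, link betweenness in the sense of Eq. (1) can be computed with a Brandes-style accumulation, sketched here for unweighted graphs. This is a standard algorithm and not necessarily the one used by the authors; the function name is hypothetical, node labels are assumed to be mutually comparable (e.g. integers), and the sum runs over ordered pairs (i, j), so each unordered pair contributes twice.

```python
from collections import deque

def edge_betweenness(adj):
    """Return {(u, v): betweenness} for every link with u < v, summing over
    ordered pairs (i, j) the fraction of shortest i-j paths using the link."""
    score = {}
    for s in adj:
        dist = {s: 0}
        sigma = {s: 1.0}   # number of shortest paths from s to each node
        preds = {s: []}    # predecessors on shortest paths from s
        order = []         # nodes in order of non-decreasing distance
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    sigma[v] = 0.0
                    preds[v] = []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Back-propagate path dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for u in preds[w]:
                c = sigma[u] / sigma[w] * (1.0 + delta[w])
                edge = (min(u, w), max(u, w))
                score[edge] = score.get(edge, 0.0) + c
                delta[u] += c
    return score
```

Comparing these values with the measured per-link packet counts gives the kind of betweenness-versus-traffic scatter discussed above: links whose traffic falls well below the value expected from their betweenness are candidates for congestion.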



F. Geographic structure 

We briefly discuss the spatial network structure, another
feature that emerges from our model. As an example, we ran
the simulation on the population density profile of the United
States. In Fig. 7(a)-(d) we show the growth of the largest
agent for a run with $n_0 = 20$, $L_x = 513$ and $L_y = 323$. Lines
are drawn between each node (pixel) and the agent's nearest
node at the time of the node's addition. In this representation
the length of each line is proportional to the wire cost.
Fig. 7(e) and (f) plot the locations of Tier 1 exchange points of
two major Internet providers, Sprint and AT&T (adapted from
Ref. (26)). There are some similarities between these real networks
and the model network of Fig. 7(d): all networks span
the whole continent and have locations concentrated in urban
areas. In future work we intend to make a statistical characterization
of the spatial aspects of the networks produced by
our model.

FIG. 7 The spatial expansion of a single agent with the US population density (people per square mile) as model input. The simulation parameters are the same as in the
rest of the paper, except $n_0 = 20$, $L_x = 513$ and $L_y = 323$. Panels (e) and (f) represent the points of presence of AT&T and Sprint within the
United States. This data was adapted from Ref. (26).



IV. DISCUSSION 

We have presented a model of communication networks
that, like the AS-level Internet, is built of spatially extended
subnetworks that have an interest in increasing the traffic running
through them. Our model networks grow slowly until
they reach a critical mass where an approximately exponential
growth begins; they closely match the degree distribution and
radial statistics of real networks. The degree distributions
of the model and of the real world lie between the distributions
of the pure BA and FKP models. Since the model incorporates
aspects of both the BA and FKP models, we hypothesize that
the degree distribution of the model, and of
the real world, is a combined result of preferential attachment
(of the BA model) and geographically constrained optimization
(of the FKP model). We are able to recreate the traffic
characteristics observed in real Internet traffic. If we run the
model on the US population density map, many features of the
backbones of large, real agents are recreated.

The different aspects of the model (traffic, geography, and
agents trying to increase the traffic they relay) all affect the
output. In this paper we do not scrutinize the model's parameter
dependence, although preliminary studies indicate that the
speed of growth (quantified by e.g. the time to reach the critical
density) is strongly dependent on the wire and attachment
prices, the population density profile (a more clumped
population distribution produces faster growth), and the users'
desire to communicate. On the other hand, the network topology
is rather insensitive to the population distribution, and also not
very dependent on how sources and destinations are generated
(e.g., introducing a distance dependence does not matter
much). The specific layout of the network is, however, dependent
on the population profile.

Many interesting extensions of the basic model are possi- 
ble. One interesting extension would, for example, be to in- 
clude business agreements between the different agents (sim- 
ilar to Ref. (HI; HH)), or change the traffic patterns from the 
person-to-person communication of the present model to a 
situation with more traffic coming from central servers. It 
might also be interesting to model intra-AS routing. Many of 
today's ASes employ "hot-potato" routing and transfer pack- 
ets to the next AS as quickly as possible, to reduce cost. Alter- 
native intra-AS routing strategies, such as routing the packet 
as close to the destination as possible, could be tested within 
the model's framework. 